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We propose a recursive algorithm for the calculation of multi-baryon cor- 
relation functions that combines the advantages of a recursive approach with 
those of the recently proposed unified contraction algorithm. The inde- 
pendent components of the correlators are built recursively by adding the 
baryons one after the other in a given order. The list of nonzero independent 

r^i components is also constructed in a recursive manner, significantly reducing 

the resources required for this step. We computed the number of operations 
required to calculate the correlators up to 8 Be, and observed a significant 
speedup compared to other techniques. For the calculation of 4 He and 8 Be 

qq correlation functions in the fully relativistic case O(10 8 ) operations are re- 

quired, whereas for non-relativistic operators this number can be reduced to 

t-h e.g. O(10 4 ) in the case of 4 He. 

O 

CO 

> 1 Introduction 

X 

Quantum Chromodynamics (QCD), the theory of the strong interaction was first intro- 
duced to describe the strong nuclear binding forces. Given this fact QCD is expected to 
be able to predict the masses and properties of atomic nuclei. Due to the strong coupling 
at low energies non-perturbative techniques such as lattice QCD (LQCD) are required 
to study bound states in QCD. In principle the tools for such studies are at hand and 
several calculations in order to examine light nuclei [l}j3] and the nuclear force [3}j6] 
have been performed recently. However, the enormous amount of Wick-contractions 
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necessary for the evaluation of correlation functions of atomic nuclei is a severe problem 
in such calculations. 

The number of Wick-contractions for the naive evaluation of correlation functions of 
multi-baryon systems scales as n u \n d \n s \, where n u , n d and n s are the number of u, d 
and s quarks in the system, respectively. Furthermore, for each Wick-contraction one 
has to evaluate the sum over all color and spin indices. These sums scale exponen- 
tially with the number of baryons in the system. As a consequence, the total number 
of required operations scales as n u \ n&\ n s \ 6 A 4 A , where A is the atomic mass number. 
The introduction of more complicated spatial baryonic wave-functions will increase this 
number further. 

Recently there has been substantial progress in reducing this computational challenge. 
An algorithm proposed in Ref. [7] reduces the computational cost by transforming the 
factorially scaling task of calculating Wick-contractions into the polynomially scaling 
task of calculating determinants. In [l] the number of contractions has been reduced 
significantly by exploiting the permutation symmetry of the quark operators. A further 
improvement has been achieved in [8], where the combined permutations of color and 
spin indices are used to create a unified list of independent contractions. While this 
method reduces the amount of contractions to be evaluated on each gauge configuration 
significantly, the creation of the list of independent contractions remains difficult. This 
is due to the fact that the full set of possible contractions, which scale factorially and 
exponentially in the number of quarks, has to be applied once to determine the coeffi- 
cients in the list. For small systems it is possible to carry out this calculation once, but 
it becomes quickly impractical for larger systems. 

For the related case of systems comprised of a large number of mesons recursive relations 
were derived in Ref. (9|, which expose linear scaling for a fixed number of sources and 
polynomial behavior in the number of sources. The generalization of this method to 
systems of protons and neutrons is possible and results in a procedure which can be 
shown to scale polynomially in the number of nucleons and exponentially in the number 
of quark sources. Such an approach has the disadvantage that the resulting recursion re- 
lations are relatively complicated. Furthermore, the projection of the individual baryons 
to definite momentum or the introduction of additional quark sources remains difficult. 

The purpose of this paper is to propose an efficient method for the calculation of baryonic 
correlation functions of the form 

IN n \ 

[C^plt^f!,^, • • • ,x N ,t) = ( ]jB ak (x k ,t) lp° ; (0,0) } (1.1) 

\k=i i=i I 

that combines the advantages of the recursive approach with those of the algorithm 
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introduced in [8] . The interpolating baryon operators are of the form 

B a = e abc (r 1 ) a/5 t(gi) i8;a [(g 2 )7;6(r 2 ) 7 5(g3)5;c], (1.2a) 
B a = e abc (I^fo)*" [(q 2 y^r 2 y 5 (q 3 ) s % (1.2b) 

where the quark operators q n G {u, d, s} and g n G {«, d, s} are all taken at the same 
spacetime point. Here and throughout in the paper Latin indices correspond to color 
degrees of freedom (DoFs) while Greek indices correspond to the spin DoFs associated 
with the quark operators. For notational convenience all upper indices correspond to 
quark operators at the source while lower indices correspond to quark operators at the 
sink. The choice of Ti = 1 and T 2 = C75 yields the interpolating operators for the 
proton with (g 1; q 2 , (73) = (u, u, d) and for the neutron with (g 1; q 2 , (73) = (d, u, d). 

The paper is organized as follows. First we review the unified contraction algorithm 
in Section [2} In Section [3] a method is introduced to construct antisymmetric tensors 
out of small building blocks in a recursive way. This procedure is applied in Section [4] 
to construct correlation functions of multi-baryon systems with one quark source and 
one baryon sink. In Section [5] a method is described to reduce the number of necessary 
operations when only the projection of the correlation function to a certain angular 
momentum state is of interest. This is followed by the generalization of the method 
to an arbitrary number of quark sources and baryon sinks in Section [6j which allows 
the calculation of arbitrarily complex correlation functions. The case of atomic nuclei 
is discussed in detail in Section [7j Finally, after comparing the efficiency of our method 
with that of other recent algorithms in Section [8] we conclude in Section [9] 



2 The unified contraction algorithm 

To provide a self-contained presentation, we briefly review in this section the unified 
contraction algorithm introduced in Ref. [8]. For the construction of multi-baryon cor- 
relation functions it is useful to define blocks of quark propagators which correspond to 
the contractions of three quarks at the source with a baryon at the sink. This blocking 
procedure, which was successfully used both for the study of light nuclei [T[j3] and for 
the study of several-nucleon forces |4]-j6], has several advantages. First it already reduces 
the number of contractions to evaluate. Second it allows to carry out the projection of 
individual baryons to definite momentum or to introduce different baryon sinks prior to 
the expensive calculation of the correlation function. The blocks are generally defined 
as 

6; a, 0, 7; a, b,c) = J2 {b*% t) ■ q^^) ■ (2-1) 

X 

Here 5 is the spin index of the baryon B, a, (3 and 7 are the spin indices of the three 
quarks qi, q 2 and q%, and a, b, c are the corresponding color indices. The form of all 
three quark source operators are taken to be the same. The function s(x) characterizes 
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the form of the baryon sink. A common choice is the projection to zero momentum 
s(x) oc 1, which is often needed e.g. when the mass of a bound state is to be extracted 
from a correlation function. For the moment it is assumed that the sink function is the 
same for all baryons and the case of different sinks is discussed later. For notational 
convenience the 4 spinor and 3 color degrees of freedom associated with a quark of a 
given flavor can be combined to form spinor-color indices which can take the values 
1,2,..., 12. Using these combined indices a block can be written as 

f$»«>{t,6tf*\#»\^) = £>(f) (B 3 (x,t) ■ q f n \i q2 \f 3) ). (2.2) 

X 

In the case of a system consisting of protons q% = q2 = u, therefore, the above tensor is 
antisymmetric in the indices ^ Q1 ^ and ^ 92 ^. 

Using the above defined blocks the correlation function of iV baryons can be expressed 
a£] 

Erqimm r+ a ■ Aqi) t(<?2) Aqs)\ fqimmz-i- x .An) Ait) An)\ 
J Bi >S2 5 S3 J ■ ■ ■ J B N V L ' °Ni S37V-2; ?3JV-1> S37V I 

• G B ' («H Z% ■ ■ ■ G B H*n; ^n-^^n-x^n)) sgn(a), (2.3) 

where the objects Gb are combinations of T-matrices and e-tensors suitable for a baryon 
B: 

G B ( a ; £ (9l) ,£ (92) ,£ (93) ) := (ri)^ (5<9l)) (r2)^ (e(92)) ^ K( " 3)) e c(e( " l))c(5(,?2))c(€<93)) . (2.4) 

Here /?(£) is the spin-index part of £ and c(£) is the color- index part and £ is the set of 
all permutations that permute the indices associated with the different quark flavors qu 
separately. The product of blocks J^ 1 ' 92 ' 93 does not depend on the permutations a and 
hence the correlation function can be written in the form 

\r<( N )] ai ' a2 >---' a N (+\ _ r . An) A12) Aqz)\ fnmmi+ x ■ A qi ) c( q2 ) Ai3)\ 

[° \8 U 5 2 ,...,5 N \ 1 ) — Jb 1 ;S2 > S3 ) • • • J B N I 1 ' °N, S3AT-2; S3JV-1) S3V ) 

r(ff)/„ n .An) AQ2) Aq3) An) Aw) Am)\ in r\ 

with the tensor 

r(JV)/„ ^ . t(«0 ^(93) tfe) t(«3)\ _ 

^1, . . . , «7V, £1 , £2 5 S3 5 • • • 5 S3V-25 S3AT-15 S3Af J — 

£ g Bi («i; • • • gBjv («^ e^-!,, e&) Bgu^. (2.6) 

In the unified contraction algorithm the object L is generated by explicitly performing all 

use the Einstein summation convention throughout the paper, that is, over each index appearing 
twice the sum is automatically understood. 
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permutations according to eqn. (2.6). Since the GVs are very sparsely populated tensors 
in most cases L is also sparse. It is then proposed to consider only those components 
of the product Z^' 92 ' 93 . . . /J? 1 ^ 92 ' 93 which are contracted with the non-zero components of 



3 Recursive construction of antisymmetric tensors 

The object L has a high degree of symmetry which reduces the number of its independent 
components. From the definition it is straightforward to see that L is antisymmetric 
under the exchange of two indices £ as long as they belong to the same quark flavor. 
It also possesses a number of spin indices a\ . . . corresponding to baryons of types 
Bi, B 2 , . . . , B N . From the Pauli-principle it follows that the correlator [C^]"*'**]'"'" 1 * (t) 
has to be antisymmetric under the exchange of any two indices a corresponding to the 
same type of baryon. Hence the same property has to hold for L. 

It can be shown using only this antisymmetric property that the maximal number of 
independent components e.g. in the case of 4 He is 30735936, whereas in the case of 8 Be 
it is 1. Since the objects G are very sparse, many of these components are expected to 
be zero. 

A component of a tensor A(£i, £2, • • • , £z), which is antisymmetric in the indices £1 , £2, • • • , £z, 
each ranging from 1 to k, can be uniquely defined by a fc-tuple A{£} = (n(l), n(2), . . . , n(k)), 
where n(i) denotes how often the value % occurs amongst the / indices in the set 
{£} = {£1, £2, . . . , £2}. As a consequence of the antisymmetry all components with 
n(i) > 1 vanish. The component associated with such a tuple is the component where 
all values i for which n(i) = 1 occur amongst the indices in asceding order. All other 
components can be constructed using permutations of the indices and taking the sign 
of the permutation into account. For example, if A is a tensor with three antisym- 
metric indices, each ranging from one to four, the tuple A{£} = (1, 0, 1, 1) corresponds 
to the component A(l,3,4). If a tensor is antisymmetric in several groups of indices 
independently, then several independent tuples can be defined, one for each group of 
indices. 

If A is an antisymmetric tensor with k indices and Y is an antisymmetric tensor with / 
indices, then their antisymmetric product Z = X»Y is a tensor with k + l antisymmetric 
indices, whose components are defined a^] 

(A . Y){z) := Z(z) = X ( x ) Y (v) sgnO%), (3.1) 

z=x+y 



2 In the definition of this product the normalization factors have been removed deliberately to speed 
up the computation. These factors will be reintroduced when the correlation function is calculated. 
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where the tuples 

z = A{£ 1 ,...,Z k+l } (3.2) 

x = A{S u ...,S k } (3.3) 

y = (3.4) 

identify the antisymmetric components and 

sgn(a;|y) = n(-l) Xi (3.5) 

i>j 

is the sign of the permutation that is necessary to bring the indices of the tensors X and 
Y into asceding order. 

If each tensor has r independent groups of antisymmetric indices then each such group 
is described by an individual tuple. In this case the antisymmetrized product can be 
written as 

(X.Y)( Zl ,z 2 ,...,z r ) : = 

Z(z 1 ,z 2 ,...,z r ) = X(x 1 ,x 2 ,...,x r )Y(y 1 ,y 2 ,...,y r )x 

z 1 =x 1 +y 1 
z 2 =x 2 +y 2 

z r =x r -\-y r 

x sgn(cci|y 1 ) sgn(x 2 \y 2 ) . . . sgn(x r \y r ). (3.6) 

In the following it will be often required to antisymmetrize only the subset of the quark 
color-spinor indices £ that corresponds to a given quark flavor q. In this case it will be 
useful to write A^{£i, £ 2 , • • • , £«} for the tuple of indices associated with the respective 
quark flavor. In the case of the spinor indices a and S of the baryons a similar notation 
is adopted: here A^{ai, a 2 . . . a n } is the tuple associated with the antisymmetrization 
of only those spinor indices that correspond to the baryon type B. 

In the special case where an antisymmetric tensor can be written as = Yi»Y 2 »- ■ ••Y n , 
a recursion relation X® = X^~^»Yi can be set up with the starting condition X^ = Y\. 
The usage of this recursion relation is often much more efficient than the direct evaluation 
of the product Y\ • Y 2 • • • • • Y n . Assuming that there are r groups of antisymmetric 
indices, each index in the p-th group can take values from 1 to m p , and in the p-th 
group at the stage there are n p indices, then the number of components at the 
intermediate step is 

r 

pW = l[C(n p ,m p -n p ), (3.7a) 
P =i 

where C(n 1; n 2 , . . . , n m ) = (ni + n 2 + . . . + n m )\ / rii\n 2 \ . . . n m \ are the multinomial coef- 
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Figure 1: The computational savings of 
the recursive algorithm in the case when 
only one antisymmetric component of the 
result X^ is of interest. has one 

group of antisymmetric indices, and each 
Yi possesses only one index. The boxes 
in the three columns represent the anti- 
symmetric compontnes of the respective 
stages X {1 \ X^ and X@\ The num- 
ber in each box corresponds to the tuple 
associated with the antisymmetric com- 
ponent. The arrows show which com- 
ponents are used to construct the com- 
ponents of the next stage. If the gray 
boxes in the tensor X^ are not needed, 
then the gray boxes in all other tensors 
are also not needed. Only the solid black 
operations have to be performed and the 
gray operations can be omitted. 

ficients. The number of operations required to go from X® to X^ t+1 ^ is 

r 

qM = Y]_C{n p ,l p ,m p -n p -l p ), (3.7b) 

P =i 

where l p is the number of indices in the p-th index group of 5^+i. 

A further reduction of the computational effort can be achieved when not all components 
of X^ at the final stage are of interest. When evaluating the product X^ = X^ 1 ) • 
Y n only those summands have to be considered which contribute to a component of 
interest in X^ n \ Using this property not only the computational effort of evaluating 
this particular product can be reduced, but also some of the components of X*™ -1 ) may 
not be required for the evaluation of the product at all. Therefore, these components of 
are not of interest and need not be computed in the previous recursion step. This 
argumentation can then be repeated for all recursion steps and often leads to a significant 
reduction of computational effort. The procedure is demonstrated in Figure [T] for the 
case where X^ has only one group of antisymmetric indices, each index ranging from 
1 to 4. The tensors Yi here possess only one index of the same format. If it is assumed 
that at the final stage only the black component of X^ is of interest then a significant 
reduction in the computational effort is achieved. 

If it is known in advance which components of X^ are of interest, then it is useful to 
determine which components of X*™ -1 ) are required for their computation and to make 
a list A*™ -1 ) of the required operations. This procedure can be repeated successively for 
all previous intermediate steps until X*- 1 ) is reached. For the computation of the desired 
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components of one then has to perform only the operations contained in the lists 

aWaC 2 ),...^"- 1 ). 

In practice the lists AW are usually much larger than the tensors X w . Therefore, the 
memory requirement of the resulting algorithm is dominated by these lists. This problem 
can be circumvented by storing at each stage i only the list of the required components of 
X^K Then the reconstruction of the list of operations A^ -1 ) introduces a relatively small 
overhead, but the amount of memory used by the algorithm is reduced significantly. 



4 Correlation functions with one quark source/sink 

Using the notation from the previous section the tensor L can be written in the forrrj^] 

L(A^{a}, A™{a}, . . . , A^tf}, A«{£}). (4.1) 

Here B a , Bb, . . . are the different types of baryons in the system. Using a similar argu- 
mentation the object G B can be written as 

G B (a,A^{0,A^{0,A^{0). (4.2) 

Although the index a is just a single spin index, it can be expressed through 4-tuples 
A^ Ba ^{a}, A^ Bb ^{a}, ... to bring G B into the same form as L. Out of these tuples only 
the one corresponding to the baryon B will have a single nonzero entry. 



The tensors L^ n ' defined in eqn. (2.6) corresponding to n baryons fulfill the recursion 
relation 

L (n+1) = L (n) . (43) 

with the starting condition 

L {1) = G Bl . (4.4) 
Here "•" denotes the antisymmetric product with multiple groups of antisymmetric 



indices as defined in equation (3.6). In general Gb„ can be a different tensor describing 
a different type of baryon for each n. The objects Gb % are products of e-tensors and 
T-matrices and are therefore often sparse. Hence the evaluation of the above recursion 
can be done very efficiently if only the nonzero components are stored. 



According to eqn. (2.5) the correlation function for a given gauge configuration is ob- 



tained by evaluating the contraction of L with the product 

f( n )(A A .•/•.£•(?!) t(«0 £•(©) An) A<&) Ai-a)\ 

\ u li ■ ■ ■ u Ni u i SI i S2 > S3 5 ■ ■ ■ 5 S3JV— 2' S3JV— 1> s; 



3iV 

JB! lMl>£l >?2 'S3 )"'JB N IMiV, ?3iV-2)^3Ar-l)?3JV )■ l 4 -^ 



3 The generalization to systems with additional quark flavors is straightforward. For notational conve- 
nience we restrict ourself here to only u-, d- and s-quarks. 
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To do so one could in principle construct all the components of L explicitly. However, it 
is computationally more efficient to exploit the antisymmetry of L directly: The tensor F 
can be projected to a tensor F_ which is antisymmetric in all indices corresponding to the 
same quark flavor or the same baryon. Only this antisymmetric projection contributes 
to the correlation function, and therefore, only the contraction between L and F_ has 
to be evaluated. 

F_ possesses the same antisymmetry structure as L , thus it can be written in the form 

F^(A^{8}, A {Bb) {5}, . . . , A {d) {£}, A (s) {£})- (4.6) 

Since F_ is composed of the independent factors f B 1,92 ' 9i , a similar recursion relation 

p{n+l) _ p{n) 9 #;gi,g2,g3 U 7) 

— — J B n+ i \ • ) 

with the starting condition F^ = f^' q2 ' q:i can be defined. In order to apply the above 
recursion relation one has to calculate explicitly the independent antisymmetric com- 
ponents of /J} 1 ' 92 ' 93 by applying all possible permutations of the combined spinor-color 
indices. Since there are only three such indices in each factor, the computational effort 
associated with these antisymmetrizations can be neglected compared to the evaluation 



of the recursion steps of (4.7). 



Once both F_ and L are ready, the correlation function can be extracted by performing 
the contraction 



C {N \t;A^{5},...,A^{a},...) 

= Jf E F {N \A^\5},...,A^{0,...)-LW(A^{a},...,A^{^},...) (4.8) 

with the normalization factor 

Af = n q Jn qb \...(n B Jn Bb \...) 2 , (4.9) 

where n qi is the number of quarks of flavor qi and Ub, is the number of baryons of type 
E>i in the system. 

The final result C^(A {Ba) {6}, . . . , A {u) {a}, . . .) itself is an antisymmetric tensor. All 
components can be reconstructed by taking into account the respective permutations to 
change the ordering of the indices. 

Due to the sparse nature of the computational cost of the determination of L( N > is 
very low. Furthermore, this construction has to be performed only once independent of 
any gauge configuration. The calculation of F_\m = 1,2, ... ,N is computationally 
much more demanding and has to be performed on each gauge configuration. The 
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maximal number of components at the intermediate stage F_ is given by the formula 
P(n%\n%\ ...) = nC(nW,4 - »£>) \[C{n%\ 12 - n M), ( 4 .10) 



where is the number of baryons E>i at the intermediate stage m and r4™ is the 
number of quarks of flavor qj at this stage. 

In a similar way the maximal number of operations required for adding the baryon Bk 
m the recursion step r_ — > rl is 



QB k (n Bi ,n B2 



n 



C(n£',0,4 



n 



C(n%\ 1,4-1 



n 



(m)x 



for z = A; 



n 



(m) 
hi j^- 
(m) 



C(nr.0.12 



n 



(m)>, 
9j y 



,N(qj,B k ),12 



n 



(m) 
9j 



for j ^ k 
N(q j: B k )) for j = fc, 



(4.11) 



where N(qj, Bk) is the number of quarks of flavor in the baryon of type -B*,. Formulae 



(4.10) and (4.11) are special cases of the more general relations (3.7a) and (3.7b) 



The estimates P and Q obtained above are only upper bounds on the real computational 
cost. Due to the large number of zero components of L^' large savings in comparison 
with these numbers are possible. During the recursive constructions of F_ , . . ., 

F_ only those components have to be calculated, which in the end contribute to a 
component contracted with a nonzero component of L^ N K To do so one calculates the 
lists Att.AC 2 ),...^- 1 ) using the procedure described in Section [3] Then only these 
operations have to be performed for each gauge configuration. 



5 Projection to angular momentum states 

The procedure described in the previous section allows the generation of all spinor com- 
ponents of the correlation function. However, in many cases not all spinor components 
are of interest, but the correlation function is to be projected to a definite angular mo- 
mentum state. It is often the case that not all components contribute to this projection 
and hence one wants to avoid the unnecessary computational effort. 

Let M. be an arbitrary tensor that projects the correlation function to the required spin 
state such that 

C M (t) = M%g^[Cl*]%g;£«{t) (5.1) 

is the desired projected correlation function. The correlation function C^ N \t) is a ten- 
sor with antisymmetric groups of indices, hence only the antisymmetric part of Ai 
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contributes to the resulting correlation function Cm(£). This antisymmetric part can be 
written as 

M-(A {B ^{a}, A {Bb) {a}, . . . , A^ Ba) {5}, A {Bb) {5}, . . .) (5.2) 
and a modified list 

{n Ba \n Bb \ . . .f L M (A^{6}, A™ {6}, . . . , A W {(}, A^tf}, A«{£}) 

H N \A^{a}, AW{a}, . . . , A^tf}, A^tf}, AW{£}) 

x 7W_(A (Ba) {a}, A^ fc) {a}, . . . , A (Ba) {<5}, A (Bb) {<5}, . . .) (5.3) 

can be defined. Then this modified list can be used to calculate the projected correlation 
function 



C M (t) = ^ E F ( - N) (A^{6}, . . . , AM{£}, ...)■ L M (A^{6}, . . . , A^{£}, ■ ■ ■)• 

(5.4) 



Here the sum goes over all tuples of antisymmetric sets of indices. Only those components 
of F_ contribute to the correlation function for which the corresponding component of 

L is nonzero. The number of contributing components is always smaller or equal to the 
number of components that would be necessary to evaluate if the complete correlation 
function were of interest. This fact can be exploited in the generation of the lists of 
operations A (1) , A (2) , . . . , A (iV) . 



6 Multiple sources/sinks 



In the previous sections the number of quark sources was set to one. In this case due to 
the Pauli-principle the maximal number of baryons is restricted in such a way that only 
12 quarks of each flavor are allowed in the system. This restriction can be circumvented 
by introducing additional quark sources. When we have N s mutually orthogonal quark 
sources, an additional source-index s ranging from 1 to N s can be introduced to each 
quark operator. Then the baryon operators at the source also have to be modified 
accordingly, 

B a]S = e abc (TO^fo)^ [fe) 7;M (r 2 r 5 (g 3 ) 5; n (6-1) 

As a consequence, the generation of the objects G B has to be modified in such a way 
that the fact that all quarks of a baryon originate from the same source is respected 
This can be achieved by the straightforward modification: the indices ^ 9i ' are promoted 



Although this restriction is not necessary for the algorithm described here, it is introduced to keep 
our notation simple and to allow for a cleaner presentation of the algorithm. The generalization to 
other cases is straightforward. 
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to combined spinor-color-source indices which range form 1 to 12N S . Then the modified 
G B becomes 

G B (a, s; C {qi) ,&\^ q3) ) ■= a'.*^ 1 V' a « (,s) > G B (a; k(^), k(^)), 

(6.2) 

where s(£) is the source part of the combined index £ and is the spinor-color 
part of £. The sink part also needs to be modified by using different sink functions 
si(x), S2(x), ... in the generation of the blocks /jj 1 ' q2,q3 (t, 5; a, (3, 7; a, b, c). 

To simplify the notation combined spinor-source indices \ and ip can be introduced to 
replace the former spinor indices of the baryons at the source and at the sink, respectively. 
Thus, the objects G and the modified blocks J| 1,92 > 93 can be written as 



(6.3) 



G B (x; C (qi \ £ (q2 \ £ (<?s) ) = 5<^' a ^ lqi ^5 8 ^'^ 92 ^5 a ^'^ q3 ^ 

■G B (a( X y,<& ) ),<& ) ),<& ) )), 



Here s(x) is the source part of the index Xi S (V0 is t ne sm k P ar t of the index t/>, and 
a(x) and a(V>) are the spinor parts of the indices x an d ^, respectively. 



The recursion relations (4.3) and (4.7) for L and F_ remain valid with the only difference 
that f^' qb ' qc and Gb are used instead of f q ^' qb ' qc and Gb- The correlation function can 
be calculated similarly to (4.8) by evaluating the contraction 

= jt E f™(a<*>m,...,a^^ (6.5) 

Ate){£} 

Here the only difference is that the result C (Ar) (t; A (Ba) {^}, . . . , A (n) {x}, • • •) instead of 
only spin indices now possesses combined source/sink- spinor indices. 

The upper bounds for the computational effort for the construction of the tensors F_ 
can be generalized in a straightforward way to 

P{n%\n™, • • •) = 11^?^. - 4?) U C «^ l2N ° - < ] ) ( 6 - 6 ) 
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and 



Qs>g?,n£\...) = n 

i 

'ml ( m ) 



C(n { ™\0,W s -n { r h 



x 



n 



C(n%\lAN s -l 
, 0, 12N S - nj m) 



(m)\ 
B; / 



for 2 7^ fc 
for i = fc 



for j k 

C(n { ™\ N( qj , B k ), 12N S - n { ™ ] - Nfa, B k )) for j = k. 



Am) 



(6.7) 



7 Atomic nuclei 

In this section the important special case of atomic nuclei, that is, systems consisting of 
protons and neutrons is discussed. For the calculation the nucleon operators 

Pa = Sabc (TllapUp-a [u T ,b(r 2 )jsds;c], (7.1a) 

N a = e abc (ri) a pdp. a [u r:b (T 2 ) 7 sds;c\ (7.1b) 
are used at the sink and the operators 

P a = e abc (T^u** [u'-\Y 2 y 5 t'\ (7.1c) 

AT" = £ abc ( ri )a^!« [y7;6( r2 )7<5/' c ] ( 71d ) 

are used at the source. There are two common choices for the set of matrices (ri,r 2 ). 
The first choice, Ti = 1 and T 2 = C75, where C is the charge conjugation matrix, 
gives fully relativistic nucleon operators. In case of the second choice, T\ = P nr and 
T 2 = Cj 5 P nr , the projection P nr = (l+7 4 )/2 to the "non-relativistic" spinor components 
is inserted. In this case, if the Dirac representation of the 7-matrices is used, only the 
upper two spinor components contribute to the operators. Therefore, by using non- 
relativistic operators the computational effort can be reduced significantly. 

Introducing the variables rip and to denote the number of protons and neutrons in 
the system, the recursion relations of adding one proton or one neutron can be written 

as 

L (n P +l,n N ) = L (n P ,n N ) # (j ^ 

L {n P ,n N +l) = L (n P ,n N ) # ^ ^ ^ 

p(n P +l,n N ) _ p(n P ,n N ) ^ ju,u,d 

p(n P ,n N +l) _ p(n P ,n N ) ^ jd,u,d ^ 2^ 

with the starting conditions either 

L^=G N and F { ° A) = f% u ' d (7.3a) 
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or 

L^ = G P and F< 1,0) = fp u ' d . (7.3b) 
The upper bound for the number of components of F_ at each stage is 

P(np, n N ) = C(np, D — n P )C(n N , D — n N ) 

C{2np + n N , 3D — 2n P — n^)C{n P + 2n N , 3D — n P — 2n N ), (7.4) 

where D denotes the effective number of spinor components. For relativistic operators 
D = 4 and for non-relativistic operators D = 2. The upper bounds for the number of 
operations to add a proton or a neutron to F^ ap ' nN ^ are 

Qp(np,riN) = (np,l, D — 1 — np)C(riN, D — n^)C{2np + njv, 2, 3D — 2 — 2np — tin) 
C{n P + 2n N , 1, 3D — 1 — n P — 2n7v), (7.5a) 

QN(np,riN) = (np, D — np)C(ri]sr,l, D — 1 — n^)C{2np + n^, 1, 3D — 1 — 2np — ri/v) 
C(n P + 2n JV ,2,3£)-2-np-2n JV ). (7.5b) 

To calculate the tensor f^ p,Nn ^ for fixed values of Np and iV^r one can choose several 



different orders of the recursion operations (7.2a -7. 2d). One could for example start 
with adding only neutrons until F^' Nn ^ is reached and then start to add only protons 
until the final result jp^ Np,Nn ' [ s obtained. For a first estimate of which order is best 
the values of the function P(n P ,n N ) at the intermediate stages can be used. Figure [2] 
gives these values for relativistic operators and compares two possible "paths" leading 
to the same state. It can be seen that for a given A = + np the tensors with a 
minimal number of components correspond to A = n^ or A = np. After investigating 
Qp(np, Tijv) and Qn^p, n^) it was found that the estimated operation count is minimal 
when first all the baryons of one type, e.g. neutrons, are added before adding any of the 
other type. Several different paths for several different nuclei have been tried numerically 
and it was found that even with the reduction due to the not required components these 
paths were still the most efficient in all tested cases. 

The lists of operations Ap P,njv ' ) and \^ p ' nN ' ) for the addition of protons or neutrons to 
p(n P ,n N ) kayg been constructed explicitly for a broad range of nuclei. These lists are 
different for each choice of the numbers Np and AOv even for the same intermediate 
stages (np,n7v) due to the fact that different components can be ignored depending on 
Np and N^. In each case the path corresponding to adding the Nn neutrons before the 
addition of the Np protons was chosen. The generation of the lists takes less then an 
hour on a standard desktop computer for all nuclei accessible with one quark source and 
relativistic operators. In the non relativistic case the generation of the lists takes less 
then 0.1 seconds on the same computer. 

The computational effort of generating the correlation functions of atomic nuclei is dom- 
inated by the operations necessary to construct the tensors F_. All other computational 
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Figure 2: Two possible paths for the recursion relation to reach F_ ' in the case of 
relativistic operators. The green (solid) path is more efficient than the red (dashed) path. 
The numbers in each box are the values of P(np,njv), that is, the upper bound for the 
number of components of p(™ p ' nN \ 



tasks, such as the summation in equation (4.8) or the construction of fp> u > d an d ffj u,d 



can be neglected. Table [T] shows the number of operations required for the construction 
of atomic nuclei with one quark source and relativistic operators. Here each operation is 
an element of a list A required for the given nucleus and amounts to a complex addition 
and multiplication. The numbers are also compared to the naive numbers of operations 
that would be required if one evaluated all Wick-contractions and spinor and color loops. 
The same information for the case of non-relativistic operators is shown in Table [2j A 
calculation with non-relativistic operators and two quark sources according to Section [6] 
was also performed. The resulting numbers of operations are shown in Table |3j 

These numbers show that the calculation of the correlation function of atomic nuclei 
using the recursive approach discussed in this paper is by many orders of magnitude 
more efficient than the naive computation. 



8 Comparison of efficiency 

In this section we compare the efficiency of the procedure described in the previous 
sections with the unified contraction algorithm introduced in Ref. [8] and the determinant 
method introduced in Ref. [7]. 

The method of Ref. [8] requires the construction of a unified list of contractions. This 
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list is identical to the tensor L^ Np ' Nn ^ used in this paper except that it is not stored in an 
antisymmetrized form. Hence the number of entries iVii St is by a factor n u \ nj n s \ larger 
than the number of entries in L^ Np ' Nn \ The generation of the unified contraction list 
in Ref. [8] is done by explicitly applying all possible permutations of quark indices. The 
effort associated with these permutations scales with n u \ n^. n s \ and even for systems of 
moderate sizes the generation of the list requires supercomputers. In contrast to this our 
recursive construction of the similar objects L requires about a second on todays desktop 
computers for all nuclei accessible with one quark source and relativistic operators. 

Once the list is generated the number of evaluations of f^ 1,q2,g3 " > required on each gauge 
configuration is AN\ ist /2 A = AN contT . In our approach the number of evaluations of 
yfai,<?2,<?3) j g e q Ua ] ^ the number of elements N\ in the operation list A required for the 
recursion. Therefore, the ratio of the efficiency of our algorithm and that of the unified 
contraction algorithm can be roughly estimated as AN contr /N\. 

To be able to compare our method with the performance numbers listed in Ref. (8j the 
different spin components have to be computed separately. We stress that this is less 
efficient than calculating all components at once if one is interested in all components. 
Individual spin components can be computed using the technique described in Section 
[5} If the spin indices of the desired component are (71, 72, . . . , ja) at the source and 
(7i, 72, • • • , 7a) at the sink then the projection tensor 

M?Jl:f A A = 6 S ^6 S ™ . . . 6 5 -^ 6 aiYi 6 a2 y 2 . . . 6 aA y A (8.1) 

is to be used. Since only the choice for the spin component at the source enters the 
calculation of the correlation function one can choose 7$ = 7^ for the efficiency compari- 
son. 

Table [4] shows the comparison between our method and the unified contraction algo- 
rithm in the non-relativistic case. Table [5] presents the same comparison in the case of 
relativistic operators. In Table [6] the efficiency in the case of non-relativistic operators 
and two quark sources is shown. For the sake of readability not all components are 
tabulated in the case of relativistic operators. More precisely only components with 
the minimal possible number of lower half spinor indices are listed. This includes all 
non-zero components calculated in Ref. [8). The computational effort associated with 
the components not listed is in all cases of the same order of magnitude as the listed 
components with the same Np and N^. 

The method of Ref. [7) uses determinants for the calculation of the quark level permuta- 
tions in the correlation function for a fixed structure of color-/spinor- indices and spatial 
location of the operators at the source and at the sink. This algorithm scales as 

n 3 u n 3 d n 3 s ■ N W N^ } (8.2) 

where N w and N' w are the number of such independent structures up to permutations of 
quarks at the source and at the sink, respectively. This algorithm is especially efficient 
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in the case where the numbers of quarks n u , n<i and n s are chosen such that all possible 
spinor- and color-DoFs are fully saturated both at the source and at the sink. In this 
case N w = N' w = 1. Such combinations of operators can be found for the nuclei 4 He, 
8 Be, 12 C, 16 and 28 Si for which concrete results are presented in Ref. [7j. However, 
in the general case the numbers N w and may become very large and in fact scale 
exponentially as already noted in Ref. [7]. 

If the individual baryons are to have a complex spatial structure, which is required e.g. 
for the projection to a definite momentum and angular momentum, it is very difficult 
to find a combination of operators for which N w and N' w remain small. In such cases 
the algorithm presented in this paper, which can perform the construction of baryon 
blocks with complex spatial structure in advance of the calculation, can be more advan- 
tageous. 

9 Summary 

We introduced a procedure to compute the correlation functions of multi-baryon systems 
using a recursive technique. In a first step a list of required components is generated 
in a recursive manner, where several antisymmetry properties of the list are exploited. 
In a second step a product of blocks of quark propagators is constructed recursively 
on each gauge configuration. During the construction both the antisymmetry of this 
product and the reduction due to the sparse nature of the list of required components 
are exploited at each intermediate step. 

The individual baryons in the system can be projected to any momentum state prior 
to the calculation, e.g. to zero momentum to extract the ground state mass. Different 
quark sources and baryon sinks can be used to create correlation functions with spatial 
structure and an arbitrary number of baryons in the system. The procedure can be 
employed for a broad range of multi-baryon systems. Systems with quantum numbers 
of atomic nuclei were discussed in detail. 

Our technique was compared in detail with the naive method and the method proposed 
in (8) and it was found that a significant speedup in all cases with A > 2 is possible. 
For the construction of the 4 He and 8 Be correlation functions with relativistic opera- 
tors (9(10 8 ) operations are required. In the non-relativistic approximation for 4 He the 
required number of operations is only O(10 A ). 
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Tables 



Table 1: Number of operations (each operation is a complex multiplication and addition) 
required to compute all independent spinor components of the correlation function with 
Np protons and Nn neutrons with one quark source and relativistic operators. Both 
the naive number and the number using the recursive approach are given, rj is the gain 
factor, that is, the ratio of the naive and the recursive numbers of operations. 



N P 


N N 


No. of op. 


Naive No. of op. 


V 







2 


199584 


15925248 


79.8 







3 


5825088 


8.3 x 10 11 


1.4 x 


10 5 





4 


54768672 


1.1 x 10 17 


1.9 x 


10 9 


1 


1 


474048 


11943936 


25.2 




1 


2 


19241280 


5.5 x 10 11 


2.9 x 


10 4 


1 


3 


109789200 


6.7 x 10 16 


6.1 x 


10 8 


1 


4 


179769600 


1.7 x 10 22 


9.2 x 


10 13 


2 


2 


531321120 


5.7 x 10 16 


1.1 X 


10 8 


2 


3 


756897264 


1.3 x 10 22 


1.7 x 


10 13 


2 


4 


291957888 


5.3 x 10 27 


1.8 x 


10 19 


3 


3 


2905079520 


4.9 x 10 27 


1.7 x 


10 18 


3 


4 


404946240 


3.0 x 10 33 


7.5 x 


10 24 


4 


4 


448496928 


2.8 x 10 39 


6.2 x 


10 30 



Table 2: Number of operations (each operation is a complex multiplication and addition) 
required to compute all independent spinor components of the correlation function with 
Np protons and Nn neutrons with one quark source and non-relativistic operators. Both 
the naive number and the number using the recursive approach are given, rj is the gain 
factor, that is, the ratio of the naive and the recursive numbers of operations. 



N P 


N N 


No. of op. 


Naive No. of op. 


V 





2 


504 


995328 


1974.9 


1 


1 


2664 


746496 


280.2 


1 


2 


6048 


8.6 x 10 9 


1.4 x 10 6 


2 


2 


10980 


2.2 x 10 14 


2.0 x 10 10 
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Table 3: Number of operations (each operation is a complex multiplication and addition) 
required to compute all independent spinor components of the correlation function with 
Np protons and Njy neutrons with two quark sources and non-relativistic operators. 
Both the naive number and the number using the recursive approach are given, rj is the 
gain factor, that is, the ratio of the naive and the recursive numbers of operations. 



N P 


N N 


No. of op. 


Naive No. of op. 


V 





2 


3024 


995328 


329.1 
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1052136 


1.3 x 10 10 


1.2 x 10 4 
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18881568 


4.2 x 10 14 


2.2 x 10 7 
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70.1 
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8.6 x 10 9 


2.0 x 10 5 
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2.6 x 10 14 


4.1 x 10 7 
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1.6 x 10 19 


2.4 x 10 11 
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2.2 x 10 14 


2.1 x 10 9 


2 


3 


10038672 


1.3 x 10 19 


1.3 x 10 12 


2 


4 


81850128 


1.3 x 10 24 


1.6 x 10 16 


3 


3 


338263368 


1.3 x 10 24 


3.5 x 10 15 


3 


4 


287427384 


1.9 x 10 29 


6.5 x 10 20 


4 


4 


448496928 


4.2 x 10 34 


9.5 x 10 25 



Table 4: The efficiency of the presented algorithm for the calculation of individual spin 
components with non-relativistic operators. N\ is the number of operations (complex 
multiplications and additions) required for the construction of F^. Nl is the number of 
independent components of the tensor L. Nn st and N con t r are the number of entries in the 
unified contraction list and the number of independent contractions from Ref. fM ; respec- 
tively. (Numbers that are not presented in are from own calculations). AN contr ./N\ 
is approximately the ratio of the number of operations required for the unified contraction 
algorithm and for the algorithm presented in this paper if additions are taken as much 
faster then multiplications. 



N P 


N N 


spinstate 


N A 


N L 


N m 


-^contr 


AN COQtr ./N A 
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(0,1) 


504 


21 
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1 


1 
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(0,0) 
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21 
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32400 


11.8 
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